Empirical risk minimization (ERM) is a principle in statistical learning theory which defines a family of learning algorithms and is used to give theoretical bounds on their performance.

== Background ==

Consider the following situation, which is a general setting of many supervised learning problems. We have two spaces of objects <math>X</math> and <math>Y</math> and would like to learn a function <math>h: X \to Y</math> (often called ''hypothesis'') which outputs an object <math>y \in Y</math>, given <math>x \in X</math>. To do so, we have at our disposal a ''training set'' of <math>n</math> examples <math>(x_1, y_1), \ldots, (x_n, y_n)</math>, where <math>x_i \in X</math> is an input and <math>y_i \in Y</math> is the corresponding response that we wish to get from <math>h(x_i)</math>.

To put it more formally, we assume that there is a joint probability distribution <math>P(x, y)</math> over <math>X</math> and <math>Y</math>, and that the training set consists of <math>n</math> instances <math>(x_1, y_1), \ldots, (x_n, y_n)</math> drawn i.i.d. from <math>P(x, y)</math>. Note that the assumption of a joint probability distribution allows us to model uncertainty in predictions (e.g. from noise in data) because <math>y</math> is not a deterministic function of <math>x</math>, but rather a random variable with conditional distribution <math>P(y \mid x)</math> for a fixed <math>x</math>.

We also assume that we are given a non-negative real-valued loss function <math>L(\hat{y}, y)</math> which measures how different the prediction <math>\hat{y}</math> of a hypothesis is from the true outcome <math>y</math>. The ''risk'' associated with hypothesis <math>h(x)</math> is then defined as the expectation of the loss function:

: <math>R(h) = \mathbf{E}[L(h(x), y)] = \int L(h(x), y) \, dP(x, y).</math>

A loss function commonly used in theory is the 0-1 loss function: <math>L(\hat{y}, y) = I(\hat{y} \ne y)</math>, where <math>I(\cdot)</math> is the indicator notation.

The ultimate goal of a learning algorithm is to find a hypothesis <math>h^*</math> among a fixed class of functions <math>\mathcal{H}</math> for which the risk <math>R(h)</math> is minimal:

: <math>h^* = \arg\min_{h \in \mathcal{H}} R(h).</math>
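To make these definitions concrete, the following is a minimal Python sketch (the names <code>zero_one_loss</code>, <code>risk</code>, <code>best_hypothesis</code>, and the toy data are illustrative assumptions, not part of the article). It approximates the risk <math>R(h)</math> by averaging the 0-1 loss over a finite i.i.d. sample and then selects the minimizer over a small, fixed hypothesis class.

<syntaxhighlight lang="python">
import numpy as np

def zero_one_loss(y_pred, y_true):
    """0-1 loss: 1 when the prediction differs from the true label, 0 otherwise."""
    return float(y_pred != y_true)

def risk(h, samples, loss=zero_one_loss):
    """Approximate R(h) = E[L(h(x), y)] by averaging the loss over
    samples (x_i, y_i) assumed drawn i.i.d. from the joint distribution P(x, y)."""
    return sum(loss(h(x), y) for x, y in samples) / len(samples)

def best_hypothesis(hypothesis_class, samples, loss=zero_one_loss):
    """Return the hypothesis in a fixed, finite class H that minimizes the
    (approximated) risk, i.e. h* = argmin_{h in H} R(h)."""
    return min(hypothesis_class, key=lambda h: risk(h, samples, loss))

# Hypothetical toy problem: x is a real number, y indicates whether x >= 0,
# and H contains two simple threshold rules.
rng = np.random.default_rng(0)
samples = [(x, int(x >= 0)) for x in rng.normal(size=1000)]

H = [
    lambda x: int(x >= 0.0),   # threshold at 0
    lambda x: int(x >= 0.5),   # threshold at 0.5
]

h_star = best_hypothesis(H, samples)
print("approximate risk of h*:", risk(h_star, samples))
</syntaxhighlight>

Replacing the true expectation by an average over a finite sample, as done here, is exactly the ''empirical'' risk that gives the principle its name.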